Global Scientific Research on SARS-CoV-2 Vaccines: A Bibliometric Analysis

Objective: We performed this bibliometric analysis to identify global scientific research on SARS-CoV-2 vaccines. Materials and Methods: In this bibliometric analysis, an inclusive search of English-language publications related to SARS-CoV-2 vaccines was conducted in the Scopus, PubMed, and Dimensions databases without year limitations. The bibliometric analysis covered the time-dependent citation density trend, journal name, journal impact factor (IF), year of publication, type of article, category, subscription or affiliation, co-authorship, and co-occurrence network. Results: The scientific literature from the three databases (Scopus, PubMed, Dimensions) shows that investigators have focused mainly on the structure of the coronavirus at different levels (organismic, cellular, and molecular). In addition, the mechanism by which the virus penetrates the cell and the effects of coronavirus on animals are well studied. Various methods and strategies are being used to develop the vaccines, including both animal-tested methods and computer models. The Dimensions database is the most representative in terms of coverage of research on the development of SARS-CoV-2 vaccines. Conclusion: This research is a scientific investigation based on a bibliometric analysis of papers related to SARS-CoV-2 vaccines. The Dimensions database provides the most representative coverage of research on the creation of a vaccine against coronavirus. It is characterized by a large number of multi-word terms (more than four words long) related to coronavirus, which makes it possible to track trends in the development of methods for creating a vaccine.


Introduction
The COVID-19 outbreak has caused many economic and psychological effects, and many casualties (1). This virus has spread worldwide with indescribable speed over a short period of time. According to experts, the COVID-19 pandemic could last for years; hence, numerous scientists worldwide are working to eradicate this virus as soon as possible (2). After the increase in cases and global spread, the World Health Organization (WHO) announced that the new coronavirus is the sixth public health emergency worldwide (3).

Diagnosis of a COVID-19 infection is generally based on laboratory and radiological assessments, and radiological examinations are extremely important in early diagnosis and treatment of this disease (4). Severe lung damage due to COVID-19 infection has resulted in high mortality rates among infected patients, and the need for mechanical ventilation is also high (5). There is no specific antiviral treatment for COVID-19 infection; the mainstay is supportive care that includes sustaining vital signs, oxygen therapy, and the reduction of complications such as multiple organ dysfunction and failure (6). Because standard treatments and effective vaccines for this infection are lacking, prevention of infection is the best recommendation.
A vaccine is a biological preparation that protects the body against certain infectious diseases. Vaccines usually contain an agent that resembles the microorganism that causes the disease and is often obtained from weakened or killed microbes, their toxins, or one of their surface proteins. Vaccines are either for prevention (to prevent or help cure an infection by a natural or artificial pathogen) or for treatment (such as a cancer vaccine that has not yet been discovered). SARS-CoV-2 vaccines fall into two groups: genetic vaccines, which use one or more of the genes of the coronavirus to stimulate an immune response, and viral vector vaccines, in which a virus is used to deliver the coronavirus gene to cells and stimulate an immune response (7). Studies on SARS-CoV-2 vaccine development are ongoing; despite significant progress, challenges still exist (8). The development of a safe, effective vaccine is a long and complicated process that typically takes 10 to 15 years (9). Currently, more than 100 candidate SARS-CoV-2 vaccines are in various stages of development and a small number are in the early phases of human clinical trials (10). SARS-CoV-2 vaccines approved by the WHO are in clinical trials (11) and close competition exists between them to achieve a positive result.
In October, 2020, the US Food and Drug Administration (FDA) approved an antiviral drug, Remdesivir (GS-5734), for the treatment of patients hospitalised with COVID-19. This is the first and only approved drug for treatment of COVID-19 in the United States. Remdesivir, an intravenous (IV) injectable drug, inhibits the substances that increase viral replication. Experts warn against the simultaneous use of this drug with hydroxychloroquine because hydroxychloroquine inhibits the therapeutic effects of Remdesivir (12). Remdesivir was originally developed to treat Ebola, but it was not effective and eventually discarded. This appears to be happening again in patients with COVID-19 infection.
The results of recent studies where Remdesivir was used to reduce the complications of COVID-19 infection showed that this drug had little effect on patient recovery (13).
Bibliometric analysis is a tool to determine the status of research conducted in a particular field (14). Trends and possible gaps in knowledge play an important role in management and decision making in science and technology (15). Bibliometric analysis mainly allows the development of analytical methods and bibliometric indicators from statistical criteria, and it is a tool that manages information records related to publications, citations, patents, reports, etc. (16). This analysis also provides additional information about data such as author(s), affiliation(s), and keywords, in addition to integrating information to develop research areas on a specific topic or disciplines.
Despite the rapid response from scientists during the COVID-19 pandemic, vaccines and antibody protection are still out of reach; in acute cases, however, the US FDA may allow emergency use of promising vaccines that have not yet fully passed safety tests (17). Even so, researchers will not know the benefits of a vaccine for at least six months. People exposed to the virus should hope to strengthen their immune system and receive supportive care from doctors and nurses to fight this disease.
The present research is necessary and important because its findings, which cover articles on SARS-CoV-2 vaccines from a defined time period, can show the status of research in this field, point to reference resources, and reveal the strengths and weaknesses of this research. Future researchers can fill the information gaps in this field by conducting further research. In this study, we intend to identify global scientific research on SARS-CoV-2 vaccines by using bibliometric analysis.

Search method and strategy
We performed a search in the Scopus database by October 2020 based on a protocol published by Mecenas et al. (18) in 2020. We searched the Scopus database for titles, abstracts, and keywords. A total of 1659 publications were found for 2019-2020 (1657 publications for 2020, 1 publication for 2019). We also performed a search in the PubMed database on 22/07/2020 and located 6727 articles, of which 6225 were published from 01/01/2019-31/07/2020. We searched the Dimensions database for "Vaccine coronavirus" in titles and abstracts and found 2326 publications for 2019-2020 (2169 publications for 2020, 157 publications for 2019). Of these, 1289 publications were from PubMed. Table S1 (See Supplementary Online Information at www.celljournal.org) provides the detailed search strategies for each selected database.

Data extraction
Data collection in this study was conducted on a number of articles by using a researcher-made form appropriate to the objectives of the research. The studied variables included: number of universities, number of journals in each university, number of articles published, number of citations, countries, publication types, first author and contact author, and the number of articles published in each of the fields of SARS-CoV-2 vaccines. We used the VOSviewer toolkit to conduct a co-occurrence analysis for the Scopus, PubMed, and Dimensions databases. We assessed the intensity with which one term was used together with another. The minimum threshold for cluster formation was set to a different number of terms for each database.
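As an illustration only, the co-occurrence assessment described above can be sketched in a few lines of Python. This is not the VOSviewer implementation: the function name `cooccurrence_counts`, the toy term lists, and the threshold value are assumptions introduced here, and VOSviewer applies its own term extraction, normalisation, and clustering steps that are not reproduced.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(docs, min_occurrences):
    """Count how often pairs of terms appear in the same document.

    docs: list of term lists, one per publication (hypothetical input).
    min_occurrences: terms found in fewer documents are dropped.
    """
    # Number of documents in which each term occurs at least once.
    occurrences = Counter(term for doc in docs for term in set(doc))
    kept = {t for t, n in occurrences.items() if n >= min_occurrences}

    pairs = Counter()
    for doc in docs:
        terms = sorted(set(doc) & kept)
        # Each unordered pair of kept terms in a document adds one link.
        pairs.update(combinations(terms, 2))
    return occurrences, pairs


# Toy corpus, invented for illustration only.
docs = [
    ["vaccine", "spike protein", "sars-cov-2"],
    ["vaccine", "sars-cov-2", "animal model"],
    ["spike protein", "sars-cov-2"],
]
occurrences, pairs = cooccurrence_counts(docs, min_occurrences=2)
```

In this toy corpus, "animal model" occurs in only one document and therefore falls below the threshold, so it contributes no links. The pair counts play the role of the link strengths between terms.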

Statistical analysis
For data processing, Excel software and descriptive statistics indicators such as the mean value were used. VOSviewer (version 1.6.15, Leiden, The Netherlands) was used for visualization. A P<0.05 was considered significant.
Table 1 lists the ten top-cited results. There were 1897 citations and three papers had at least 200 citations. The first paper had 514 citations and was published by Wrapp et al. (19) in the Proceedings of Department of Molecular Biosciences, University of Texas at Austin (Austin, TX, USA).

Journals
The "Journal of Biomolecular Structure and Dynamics" has by far the largest number of contributions to COVID-19 research with 33 publications, followed by "Nature" with 15 papers and "Medical Hypotheses" with 14 papers. Altogether, the ten highest-ranking journals issued 129 articles, which accounted for 17.25% of the 748 publications in this area. Table 2 lists the top ten funding agencies and highest-ranking journals.
A total of 30 (4.02%) publications were supported by the National Natural Science Foundation of China and 29 (3.88%) were funded by the National Institutes of Health (Table 2).

Journal impact factor
Impact factors (IFs) for the journals with the top-cited articles ranged from 1.322 to 42.778 (median: 3.324). Overall, 52 of the top-cited studies were published in journals that had IFs above 15 (Table 2). Finally, the correlation between the number of top-cited papers and journal IFs did not show any statistical significance (P>0.05).
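The text does not state which correlation test was used. As a hedged illustration, the sketch below computes a median and a Spearman rank correlation with the Python standard library; the choice of Spearman, the helper names, and the journal data are all assumptions made for this example, and no P value is calculated.

```python
import math
import statistics


def average_ranks(values):
    """Rank values from 1..n, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = statistics.fmean(rx), statistics.fmean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical journal data, invented for illustration only.
impact_factors = [1.3, 3.3, 42.8, 2.1, 15.0]
top_cited_papers = [4, 1, 3, 2, 5]

median_if = statistics.median(impact_factors)
rho = spearman_rho(impact_factors, top_cited_papers)
```

A coefficient close to zero, as in this invented sample, would be consistent with the absence of a significant association, although a formal test is still needed to obtain a P value.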

Publication type
Overall, 1868 articles were cited 12 675 times and 1089 review papers were cited 9710 times. The articles had a higher average citation count per study (429 times) than the review papers (378 times). Medicine was the most popular research category, followed by biochemistry, genetics, immunology, and microbiology. In terms of research category, 60 published studies pertained to clinical research, of which 11 papers were about therapeutic vaccines (eight full papers and three protocols, including nine that pertained to phase I/II research studies and two to phase III studies) (20-30). Table S2 (See Supplementary Online Information at www.celljournal.org) provides detailed information about these clinical trials.

Language and year
All of these papers were published in English from 2019 to 2020.

Country
The United States produced the most publications with 190 papers (25.40%), followed by India (126 publications, 16.84%) and China (88 papers, 11.76%). The United States ranked first in terms of gross domestic product (GDP) and articles per million population, with 0.009 articles per billion GDP (Table 3).
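The percentages and the GDP-normalised figure above follow from simple ratios, sketched below. The publication counts and the total of 748 come from the text; the GDP value is an assumption introduced only to illustrate the normalisation, and the helper names are hypothetical.

```python
def share_percent(part, total):
    """Share of a total, as a percentage rounded to two decimals."""
    return round(100 * part / total, 2)


def per_unit(count, denominator):
    """Publications per unit of a normalising quantity (GDP, population)."""
    return count / denominator


TOTAL_PUBLICATIONS = 748  # total reported in the text
papers = {"United States": 190, "India": 126, "China": 88}  # from the text

shares = {
    country: share_percent(n, TOTAL_PUBLICATIONS)
    for country, n in papers.items()
}

# ASSUMPTION: a US GDP of roughly 21,400 billion USD. This figure is not
# given in the text and is used only to illustrate the normalisation.
ASSUMED_US_GDP_BILLION_USD = 21_400
us_per_billion_gdp = round(
    per_unit(papers["United States"], ASSUMED_US_GDP_BILLION_USD), 3
)
```

Under the assumed GDP value, the ratio rounds to 0.009 articles per billion GDP, which matches the figure reported for the United States.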

Co-authorship network by authors
In the Scopus database, we found 1659 articles with 6545 authors. Of the 6545 authors, 59 had at least five published papers.
The Scopus co-authorship network included 59 authors in nine clusters. However, clusters 8 and 9 included only one author each (Fig.S1A, See Supplementary Online Information at www.celljournal.org). There were 2326 articles from the Dimensions database with 10 356 authors in the corpus. We selected 59 of the most cited authors who had at least five publications to compare with the co-authorship networks obtained for the corpora from Scopus and PubMed. All authors of the articles were selected for consideration. The Dimensions co-authorship network included 59 authors in ten clusters (Fig.S1C, See Supplementary Online Information at www.celljournal.org).
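A simplified view of how such a network can be built is sketched below. Connected components are used here as a stand-in for clusters, which is an assumption: VOSviewer applies its own clustering technique, so the grouping need not coincide. The function name and the toy author lists are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import combinations


def coauthorship_components(papers, min_papers):
    """Group prolific authors into connected co-authorship components.

    papers: list of author lists, one per publication (hypothetical input).
    min_papers: authors with fewer publications are left out.
    """
    counts = Counter(a for authors in papers for a in set(authors))
    kept = {a for a, n in counts.items() if n >= min_papers}

    # Undirected graph: an edge joins two kept authors who share a paper.
    graph = defaultdict(set)
    for a in kept:
        graph[a]  # touch the key so isolated authors still appear
    for authors in papers:
        for a, b in combinations(sorted(set(authors) & kept), 2):
            graph[a].add(b)
            graph[b].add(a)

    # Depth-first search collects each connected component once.
    seen, components = set(), []
    for start in sorted(graph):
        if start in seen:
            continue
        stack, group = [start], set()
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            group.add(node)
            stack.extend(graph[node] - seen)
        components.append(group)
    return components


# Toy data, invented for illustration only.
papers = [["A", "B"], ["A", "B"], ["C", "D"], ["C"], ["D"], ["E"], ["E"]]
components = coauthorship_components(papers, min_papers=2)
```

In this toy example, author "E" meets the publication threshold but shares no paper with another retained author, so it forms a group of its own. This is one way a network can end up with single-author groups.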

Co-authorship network by organizations
Out of 5185 organizations in the Scopus corpus, five had connections and published at least five scientific papers. There were 34 organizations that had at least three publications. We identified 34 organizations that formed 22 clusters in the co-authorship network. Table S4 (See Supplementary Online Information at www.celljournal.org) shows the first six clusters. We compared the composition of the clusters of the co-authorship networks by organization obtained from the Scopus, PubMed, and Dimensions corpora in Table S4 (See Supplementary Online Information at www.celljournal.org).

Co-occurrence network map of keywords
A co-occurrence analysis of keywords was performed that displayed the existing links between keywords used in the publications. In the central part of the map, the terms most frequently encountered in publications are displayed. The keywords/terms were extracted from the title field. A term is presented as a chain of elements (nouns with definitions) with a noun at the end of the phrase [van Eck and Waltman, (31)] (Fig.S3, See Supplementary Online Information at www.celljournal.org).
In our analysis of the corpus from Scopus, we set the threshold for the minimum number of keyword occurrences at ten. This analysis resulted in 39 keywords out of a total of 3625 (Table S5, See Supplementary Online Information at www.celljournal.org). For correct comparison, we chose the same number of terms in the corpus of the Dimensions database. The threshold for the minimum number of keyword occurrences was set at 15. The analysis resulted in 71 keywords out of a total of 4865. VOSviewer automatically filtered out the 40% least relevant terms; therefore, we chose 43 terms. The common terms "use", "India", "time", and "knowledge" were excluded from the general list (Table S6, See Supplementary Online Information at www.celljournal.org). For correct comparison, we chose the same number of terms in the corpus of the PubMed database. We set the threshold for the minimum number of keyword occurrences at 90. The analysis resulted in 88 keywords out of a total of 26 884. VOSviewer filtered out about 40% of the terms; hence, 53 terms were chosen. We excluded 14 common terms such as "Saudi Arabia", "vitro", "South Korea", and "lesson" from the general list (Table S7, See Supplementary Online Information at www.celljournal.org). Only five terms were present in all three text corpora (COVID-19, China, novel coronavirus, prevention, nCoV) (Fig.2).
The previous experiment was limited in the number of terms. In addition, manual filtering of terms might have affected the result. We therefore repeated the experiment with more terms. We chose the mapping conditions (minimum number of occurrences, minimum cluster size) such that the number of terms approximated 450 and the number of clusters was four (Fig.S4, See Supplementary Online Information at www.celljournal.org).
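The comparison of term lists across corpora amounts to set intersection, sketched below. The five shared terms are those named in the text; the remaining terms in each set and the function name are assumptions added only to make the example runnable, and terms are lower-cased so that spelling variants in case do not hide an overlap.

```python
def corpus_overlaps(term_sets):
    """Return terms shared by all corpora and by each pair of corpora."""
    names = sorted(term_sets)
    common = set.intersection(*(term_sets[n] for n in names))
    pairwise = {}
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            pairwise[(a, b)] = term_sets[a] & term_sets[b]
    return common, pairwise


# The first five terms are those reported as common to all three corpora.
shared = {"covid-19", "china", "novel coronavirus", "prevention", "ncov"}

# Extra terms per corpus are hypothetical placeholders.
term_sets = {
    "Scopus": shared | {"virtual screening"},
    "PubMed": shared | {"animal model"},
    "Dimensions": shared | {"animal model"},
}
common, pairwise = corpus_overlaps(term_sets)
```

The pairwise dictionary is keyed by alphabetically ordered database names, for example `("Dimensions", "PubMed")`, so that each pair is listed once.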
For correct comparison, we chose the same number of terms in the corpora of the PubMed and Dimensions databases. The threshold for the minimum number of keyword occurrences was set at four for PubMed and three for Dimensions. The analysis resulted in 783 keywords out of a total of 10 784 for PubMed and 577 out of 4865 for Dimensions, excluding the 40% that were removed by VOSviewer. Finally, we chose 462 terms for PubMed and 459 terms for the Dimensions database. We did not exclude any terms from the final list. In order to obtain four clusters, we set a minimum cluster size of 60 terms for PubMed and 80 for the Dimensions database. The main keywords for each of the four clusters (Top-20) from the term co-occurrence maps (ranked by total link strength) are presented in the Supplementary Online Information. Table S10 (See Supplementary Online Information at www.celljournal.org) presents a comparison of the terms obtained (450 units) from the different corpora. The common vocabulary, general scientific vocabulary, and general medical vocabulary were excluded (Fig.3). Figure S5 (See Supplementary Online Information at www.celljournal.org) shows the relationship of the terms of each group between corpora from the different databases. The most common term in publications related to research in the field of vaccine and coronavirus was "sars cov" in the PubMed database; the terms were organized based on the time of appearance (Fig.S6, See Supplementary Online Information at www.celljournal.org).
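One way to choose a minimum-occurrence threshold that yields a target number of terms is a simple search over candidate thresholds, sketched below. This is an illustrative procedure, not a documented VOSviewer feature: the function names, the toy occurrence counts, and the assumption that exactly 60% of terms survive the relevance filter (the complement of the 40% mentioned in the text) are introduced here for the example.

```python
from collections import Counter


def terms_after_relevance_filter(n_terms, keep_fraction=0.6):
    """Number of terms left if a fixed fraction is kept (assumed 60%)."""
    return round(n_terms * keep_fraction)


def pick_threshold(occurrences, target, keep_fraction=0.6):
    """Find the minimum-occurrence threshold whose retained term count
    is closest to the target; ties go to the lower threshold."""
    best = None
    for threshold in range(1, max(occurrences.values()) + 1):
        n_terms = sum(1 for n in occurrences.values() if n >= threshold)
        kept = terms_after_relevance_filter(n_terms, keep_fraction)
        gap = abs(kept - target)
        if best is None or gap < best[0]:
            best = (gap, threshold, kept)
    return best[1], best[2]


# Toy occurrence counts, invented for illustration only.
occurrences = Counter(
    {"a": 1, "b": 1, "c": 2, "d": 2, "e": 3,
     "f": 3, "g": 4, "h": 5, "i": 6, "j": 7}
)
threshold, kept = pick_threshold(occurrences, target=4)
```

Applied to the counts reported for the smaller experiment, `terms_after_relevance_filter(71)` gives 43 and `terms_after_relevance_filter(88)` gives 53, in line with the numbers of terms chosen there.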

Discussion
This is the first bibliometric research that summarizes numerous characteristics of the investigations of SARS-CoV-2 vaccines. An understanding of the features of global research on SARS-CoV-2 vaccines may be beneficial. In this bibliometric study, we reviewed the literature from three databases: Scopus, Dimensions, and PubMed. We identified 1659 articles from Scopus and 6545 authors in the corpus, of which 59 authors had at least five publications. The co-authorship network includes these authors in nine clusters.
For the 6225 articles from PubMed, 26 509 authors were listed in the corpus, and the co-authorship network includes 59 authors in 19 clusters. Moreover, for the 2326 articles from the Dimensions database, 10 356 authors were in the corpus, of which 59 authors who had at least five publications were included in a co-authorship network of ten clusters. As can be deduced from the results, although there have been many studies on SARS-CoV-2 vaccines and a number of vaccines have shown acceptable efficacy against SARS-CoV-2, we cannot yet expect with certainty a fully safe and effective vaccine against this disease, especially for various age groups and various viral strains (32).
To our surprise, we found that at most nine authors were duplicated among the three corpora of publications. Scopus and PubMed had nine mutual authors, whereas Scopus and Dimensions had eight, and PubMed and Dimensions had three. There were 33 (55.9%) non-recurring authors in the Scopus network, 37 (62.7%) in the PubMed network, and 39 (66.1%) in the Dimensions network. It should be noted that Chinese and Indian authors prevailed among Scopus authors. European authors prevail in PubMed, and the Dimensions database comprises European, Chinese, and Indian authors.
To date, there are more than 300 approved candidates for the SARS-CoV-2 vaccines, and 32 have already undergone clinical trials. In addition, the paradox of vaccine production has been raised in some countries (33). Vaccines can help prevent the spread of disease by stimulating the immune system. Our knowledge of COVID-19 is far less than our ignorance, and the complexity of this disease makes us think more deeply about a vaccine. In the first stage, the vaccine is tested on a small number of subjects in order to prove that it is safe. In the second phase, the vaccine will be tested on a larger number of patients to evaluate its effectiveness. Both safety and efficacy of a vaccine are very important and vital (34).
The organizations that were identified in the mapping of co-authorship also differed for the three corpora. The organizations that appeared in two corpora were: University College London, Fudan University, University of Washington, Tehran University of Medical Sciences (Tehran, Iran), Ohio State University, and Yale University. No organization appeared in all three. In a preliminary experiment, our comparison showed considerable variation in terminology. Common words comprised disease names (COVID-19, novel coronavirus, nCoV), disease prevention, and country of origin (China). Although unlikely, the COVID-19 outbreak could abruptly end before a safe and effective vaccine is available; nevertheless, we must continue our efforts to find such vaccines in order to be prepared to fight this disease if an outbreak recurs (35). Given that scientists and research and development centres are competing to develop SARS-CoV-2 vaccines, it is necessary to accelerate and streamline that process because a vaccine may be the only approach that enables the development of immunity to SARS-CoV-2 across a population (36).
As a result, scientific biomedical information databases are often used by physicians and researchers. In this article, we compared different aspects of basic biomedical scientific information databases. PubMed is a very significant resource for physicians and researchers, whereas Scopus covers a wider range of journals and offers more citation analysis capabilities than the other databases (37). At the same time, the PubMed and Dimensions corpora overlapped in eight terms. The Scopus and Dimensions corpora overlapped in 13 terms. The Scopus and PubMed corpora did not overlap in any terms. In order to clarify the terminological mapping, we conducted research on a large number of terms. The results of the previous study could have been influenced by human factors because we excluded a number of uninformative terms at the last stage. In the current study, we considered all terms without exception. The main term common to the Scopus and PubMed databases was 2019-COV. The PubMed and Dimensions databases had eight terms in common (new coronavirus, novel coronavirus covid, novel coronavirus SARS-CoV, porcine deltacoronavirus, porcine epidemic diarrhoea virus, SARS CoV2, SARSCoV2 infection, and severe acute respiratory syndrome). Therefore, PubMed had more research with animal models on COVID-19 compared to the other databases (38). In addition, more accurate names of this disease were used in this corpus, which was not surprising given the medical nature of this database.
We analysed the lexical groups of the terms, of which the thematic vocabulary group is of interest as it contains terms related to the field of vaccine development (39). Surprisingly, the Dimensions database had the most topic-specific words (249 words), followed by Scopus (221 words) and PubMed (193 words). However, there were words that were repeated in the three corpora. Among the general terms, some suggested that animal models are essential for designing a vaccine toolkit, alongside approaches such as immunoinformatics, virtual screening, and molecular dynamics simulation, which have been popular methods for vaccine development so far (40).
The structure of the coronavirus is being studied. Animal models are mentioned in the PubMed and Dimensions corpora. In the Scopus corpus, terms for penetration of the virus into the cell and the body's response to the virus were frequent. PubMed has frequent immunological terms, but the largest number of specific terms was identified in the Dimensions database. Thus, the corpus from the Dimensions database provided a more complete picture of the research topics for SARS-CoV-2 vaccine development. This could help health policy makers make decisions about incorporating new research into vaccine development (41).
We attempted to analyse temporal dynamics using the PubMed corpus as an example by taking into account the medical specifics of this database because these collections were less represented in the Dimensions and Scopus databases. The time slot is wider in the PubMed database. Unlike other databases, PubMed contains many articles about previous strains of coronaviruses. Nevertheless, the interval of publication activity was small. Hence, we decided to consider temporal dynamics in future studies.
There were 10874 terms in the PubMed collection. We chose 147 terms because they were the most relevant and popular. The clustering of these terms resulted in ten clusters. There were eight clusters with the minimum cluster size that equalled one.

Conclusion
This study is a scientific bibliometric analysis of studies on SARS-CoV-2 vaccines. A comparative analysis of the scientific literature was carried out for three databases: Scopus, PubMed, and Dimensions. We determined that tremendous attention is paid to the study of the coronavirus structure at the organismic, cellular, and molecular levels. Penetration of the virus into the cells is well studied. A variety of methods and strategies are being used to develop a vaccine. The features of the influence and development of coronavirus in animals are well understood. Both animal and computer models are being used to create a vaccine for humans. The Dimensions database is the most representative in terms of coverage of research on the creation of a vaccine against coronavirus. It is characterized by a large number of multi-word terms (more than four words) related to coronavirus, which makes it possible to track trends in the development of methods for creating a vaccine.